ABSTRACT

A hierarchical filing system provides a cataloging of data
stored in various locations within a memory device. An
upside-down tree type structure provides a cataloging
structure wherein directories and files branch from a root
directory. Each file and directory record carries the
directory identifier value of its parent to provide the
interconnection necessary to form the cataloging structure.
The complete cataloging structure is organized in the leaf
nodes of a B-Tree structure and distributed in an ascending
order of the key values to provide a systematic search for a
given key. Each file is capable of storing a predetermined
amount of location description information when data is
segmented into non-contiguous segments in memory. A file
extents record is used to maintain record of the further
segmentation. File location information is kept in the form
of file extents descriptors in the leaf nodes of the separate
File Extents B-Tree. This extents information is sorted in an
ascending order based on a key comprised of a unique file
number and a file-relative starting block location of the
file extent.


6 Claims, 5 Drawing Sheets 
HIERARCHICAL FILE SYSTEM TO PROVIDE
CATALOGING AND RETRIEVAL OF DATA 


This is a continuation of application Ser. No. 924,802,
filed Oct. 30, 1986, now abandoned.


BACKGROUND OF THE INVENTION 
1. Field of the Invention


The present invention relates to the method of storing and
retrieving data using a computer, and more specifically to a
hierarchical filing system.

2. Prior Art 

In a computer system, information is typically stored as
signals on various storage mediums, such as magnetic tapes,
disks, semiconductor devices, etc. As storage densities
increased with advances in storage device technology, it
became possible for a device to store much more information
than previously.


When information is stored on a device, it is cataloged so
that the same information is later retrieved when desired.
Normally, a unique code name is attributed to a particular
body of data to differentiate it from others. To retrieve a
desired body of data, an appropriate code name associated
with that data is used, wherein the device searches for that
code name and retrieves the desired data when that code name
is found.

It is well-known in the prior art that each separate body of
data is termed a file and the cataloging of these files on a
device is termed filing. Typically, code names associated
with particular data contain pointers which point to areas in
memory reserved for mass storage. The various code names and
their pointers comprise the cataloging system. When
high-density storage devices are used, millions of bits of
information are capable of being stored on such a device,
which permits hundreds, thousands, and even millions of files
to be created. To search through these files in a serial
fashion to look for a specific file is time-consuming.
It is appreciated that what is needed is a filing system for
a high-density storage medium which rapidly searches and
retrieves the desired file stored. Further, with the advent
of the personal computer (PC) and the small business
computer, where physical size is a concern, it is desirable
to have a filing system which may be


[remainder of sentence illegible in source]


SUMMARY 


A method for providing a hierarchical filing system is
described. The filing system provides a hierarchical catalog
of the data stored in various locations within a memory
device. Typically, one cataloging structure is used to
organize a volume of memory.


The cataloging structure of the hierarchical filing system is
provided by an upside-down tree type structure wherein there
is a starting directory which operates as a root directory.
Other directories and files emanate as offspring. A plurality
of descendant levels branch downward to provide the
hierarchical structure of the catalog. The cataloging
structure contains the location information of where the
actual data is stored.

The file cataloging system is implemented using a B-Tree. The
cataloging information is kept in the leaf nodes of the
B-Tree. The non-leaf nodes (index nodes) of the B-Tree
contain information that allows searching for particular
catalog information by using the code name or key of the
corresponding file. Key values, which are used to identify
and catalog various files in the cataloging system, are also
used to organize the catalog in the leaf nodes of the B-Tree.
The keys are placed in an order for systematic access.
Further, the B-Tree grows by using left rotates and left
splits with insertion of catalog information about new files
from the right to maintain a balanced tree.

When a file's data is stored, additions, deletions and 
modifications will typically result in non-contiguous 
physical storage of the data in the memory device. Each 
of the contiguous segments of the file is known as a file 
extent. A record of the physical location of the extents 
for a particular file is maintained in one or more extents 
records. The hierarchical filing system uses a file extents 
list to maintain the extents records of the various files on 
the memory device. 

The present invention maintains the first extents re- 
cord of a file in the cataloging structure, but any further 
extents records are maintained in a separate file extents 
list. This file extents list is also implemented in a second 
B-Tree structure. 


BRIEF DESCRIPTION OF THE DRAWINGS 


FIG. 1 is a representation of a prior art flat filing 
system. 

FIG. 2 is a representation of a hierarchical filing 
system of the present invention. 

FIG. 3 is a representation of a B-Tree structure of the 
present invention. 

FIG. 4 is a representation of contents of a node for the
B-Tree structure of FIG. 3.

FIG. 5 is a representation of a left-split and a left-rotate
operation of a B-Tree structure of the preferred embodiment.

FIG. 6 is a representation of a cataloging structure of 
the preferred embodiment and an organization of the 
cataloging structure in various nodes of a B-Tree. 

FIG. 7 is a representation of a volume allocation 
mapping in a filing system of the preferred embodiment. 

FIG. 8 is a representation of a file extents list of the
preferred embodiment and showing various file extents in
memory.

FIG. 9 is a representation showing the file extents 
organization in the Catalog and Extents B-Trees of the 
preferred embodiment. 


DETAILED DESCRIPTION OF THE 
PREFERRED EMBODIMENTS 


The present invention describes a method of storing 
and retrieving information using a hierarchical filing 
system. In the following description, numerous specific 
details are set forth in order to provide a thorough 
understanding of the present invention. It will be obvi- 
ous, however, to one skilled in the art that the present 
invention may be practiced without these specific de- 
tails. In other instances, well-known methods have not 
been described in detail in order not to unnecessarily 
obscure the present invention. 

Referring to FIG. 1, a prior art flat filing system 10 is 
shown having a directory 11 and files 12-15. For ease of 
understanding, a directory is shown pictorially as a 
folder and a file is shown as a sheet of paper with a 
folded corner. The pictorial representation applies well 
to an analogy of placing papers into folders (files into 
directories). In the prior art system 10, there is present a
single directory 11, which contains locator information for
files 12-15. Each of the files 12-15 contains data which
would be associated with a specific body of stored
information. In this particular example of a prior art system
10, to access file 15, a serial search is made through
directory 11 until the file address of file 15 is located,
such sequential search resulting in considerable lapse of
time when substantial numbers of files exist in the directory
11. Although in this hypothetical example directory 11
maintains pointer addresses to four files 12-15, directory 11
will continue to store addresses of subsequent files in a
sequential fashion.

FIG. 2 illustrates the architecture of the Hierarchical
Filing System (HFS) of the present invention. This particular
HFS 16 includes a root directory 17 and files 21-24. The HFS
16 also includes directories 18-20. Each directory is capable
of containing files, as well as other directories such as
directory 18 containing directory 20. Each directory is a
branching node, allowing for none or a plurality of
sub-branching nodes. Each directory contains information
which permits the branching to occur. The actual data is
stored in the files 21-24. Because each file is a termination
node, it does not need to maintain further branching
information. Instead, each file stores the actual data.
Therefore, the directories 17-20 maintain branching
information, while files 21-24 contain the stored data.

HFS 16 accesses files 21-24 in a hierarchical fashion so that
serial search for the files is not necessary. Assume in the
example of FIG. 2 that access to data stored in file 23 is
desired. A search of directory 17 reveals that two possible
paths exist in seeking the address of file 23. One path from
directory 17 leads to directory 18 and the other path leads
to directory 19. The desirable path is to directory 18, at
which point there are again two paths. The desirable path
from directory 18 leads directly to file 23. Although this
example is simplistic because of the minuscule number of
files shown, one can appreciate the file search time saved
when a substantially large number of files are present.

Further, as an example, if file 22 had been chosen, the path
from directory 18 would have led to directory 20, at which
point two paths exist from directory 20. The desirable path
to file 22 from directory 20 then would have been chosen. HFS
16, although shown in a particular form in FIG. 2, may have
any number of levels (branchings) down from the root
directory 17 as well as any number of branches from a
particular directory. However, it is to be noted that all
data is stored in the represented files 21-24 which are all
located at the termination nodes of HFS 16.
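
The walk through FIG. 2 can be sketched in a few lines of Python. This is only an illustration, not the implementation of the preferred embodiment: nested dictionaries stand in for directories 17-20, strings stand in for file data, the names `dir18`, `file23` and so on are invented labels, and the placement of files 21 and 24 is an assumption because the text does not state it.

```python
# Root directory 17 is the outer dict. Directories are dicts, files are strings.
# The positions of file21 and file24 are assumed for illustration.
hfs = {
    "dir18": {
        "dir20": {"file21": "data 21", "file22": "data 22"},
        "file23": "data 23",
    },
    "dir19": {"file24": "data 24"},
}

def open_path(root, path):
    """Follow one branch per level instead of scanning every file."""
    node = root
    for name in path:
        node = node[name]  # raises KeyError if the branch does not exist
    return node

print(open_path(hfs, ["dir18", "file23"]))           # path taken for file 23
print(open_path(hfs, ["dir18", "dir20", "file22"]))  # path taken for file 22
```

Each lookup touches one directory per level, which is the saving over the serial search of FIG. 1.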

In actuality, the cataloging architecture of the preferred
embodiment contains cataloging locator description
information in the HFS 16 structure. The catalog entries for
files 21-24 contain pointers which provide locator
descriptions to locate places in storage area where actual
stored data is kept.


B-TREE 


The HFS of the present invention is implemented using two
B-Tree structures in the preferred embodiment, the Catalog
B-Tree and the File Extents B-Tree. A B-Tree structure is
well-known in the prior art and is described in The Art of
Computer Programming, Volume 3 (Sorting and Searching), by
Donald E. Knuth, at Section 6.2.4, titled “Multiway Trees”,
pp. 471-479 (1973). The nodes of a B-Tree contain records,
wherein each record is comprised of certain information,
either pointers or data, and a key associated with that
record.

Referring to FIG. 3, a hypothetical B-Tree is illustrated. A
basic feature of the B-Tree 31 is that data is stored only in
leaf nodes 35-38. The internal nodes 32-34, also known as
index nodes, contain pointers to other nodes such that these
index nodes 32-34 provide an index for accessing the data
records stored in the leaf nodes 35-38. Each record 39
includes a key 40 and an information segment 41. Within each
node, the records are maintained so that their keys are in
ascending order. The example B-Tree 31 of FIG. 3 contains
hypothetical keys which have been inserted to show the
structure of the tree, and the relationship between index
nodes 32-34 and leaf nodes 35-38. Leaf node 35 contains key
values 48 and 50. The first key of a node is also represented
as a key in its ascending node. Therefore key 48, which is
the first key of leaf node 35, is also represented as a key
within index node 33. Key 53, which is the first key of leaf
node 36, is represented as the second key of index node 33.
Also, because key 48 is the first key within index node 33,
it is again represented as a key within index node 32. This
pattern is repeated for each leaf node 35-38 and each
ascending index node 32-34 for a B-Tree structure. Although
FIG. 3 shows only three levels and two keys per node, any
number of keys per node, as well as any number of levels, may
be
chosen for a particular B-Tree structure. B-Tree 31 of FIG. 3
as drawn is a hypothetical example for illustration [...] the
leaf node have been examined. The key values may [...] or
numeric.

[...] of the various nodes at a given level. For each node,
NDBLINK 52 contains a pointer to the previous node, and
NDFLINK 51 contains a pointer to the subsequent node at the
same level. In FIG. 3, NDBLINK for node 36 would point to
node 35 and NDFLINK for node 36 would point to node 37.
Therefore, NDBLINK 52 and NDFLINK 51 are means of locating
adjacent nodes without first reversing back up the B-Tree.
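
As a rough sketch of the structure described for FIG. 3, the Python below builds a three-level tree in which each index node repeats the first key of each child, and the nodes of each level are chained with forward and backward links. The class and function names are invented for this illustration, and every key other than 48 and 50 is a placeholder, since the text gives only those two leaf values.

```python
from bisect import bisect_right

class Node:
    def __init__(self, keys, children=None, values=None):
        self.keys = keys          # keys in ascending order
        self.children = children  # index node: one child per key
        self.values = values      # leaf node: one data item per key
        self.flink = None         # plays the role of NDFLINK
        self.blink = None         # plays the role of NDBLINK

def link_level(nodes):
    """Chain the nodes of one level with forward and backward links."""
    for left, right in zip(nodes, nodes[1:]):
        left.flink, right.blink = right, left

def search(root, key):
    """Descend from the root; return (leaf reached, value or None)."""
    node = root
    while node.children is not None:
        # Choose the last child whose first key is <= the search key.
        i = max(bisect_right(node.keys, key) - 1, 0)
        node = node.children[i]
    i = bisect_right(node.keys, key) - 1
    if i >= 0 and node.keys[i] == key:
        return node, node.values[i]
    return node, None

# Node names follow the reference numerals of FIG. 3.
leaf35 = Node([48, 50], values=["rec48", "rec50"])
leaf36 = Node([53, 60], values=["rec53", "rec60"])
leaf37 = Node([70, 75], values=["rec70", "rec75"])
leaf38 = Node([80, 90], values=["rec80", "rec90"])
index33 = Node([48, 53], children=[leaf35, leaf36])
index34 = Node([70, 80], children=[leaf37, leaf38])
root32 = Node([48, 70], children=[index33, index34])
link_level([leaf35, leaf36, leaf37, leaf38])
link_level([index33, index34])

leaf, value = search(root32, 60)
print(value, leaf.blink.keys, leaf.flink.keys)
```

After a search lands on leaf 36, its links reach leaves 35 and 37 directly, with no return trip through the index nodes.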

The records segment 44 contains the B-Tree’s re- 
cords, each with its key and pointer or data information. 
In this particular example, there are two records 60 and 
61. The records in a node can be of variable length. For 
this reason, offsets to the beginning of each record are 
needed. The records segment begins immediately fol- 
lowing the node descriptor segment 43. The records are 
followed by a free space segment 45, which is basically 
the unused space of the node. Therefore, free space 
segment may not exist in some instances. The record 
offset segment 46 at the end of the node contains the 
offset information for records 60 and 61. Offset 68 contains
offset information for record 60 and offset 67 contains
offset information for record 61. Offset 66 contains the
offset necessary to determine free space 62. Thus the
record segment 44 builds downward into the free space 
segment 45, while the record offset segment 46 builds 
upward into the free space segment 45 from the oppo- 
site end. 
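
The two-ended node layout can be sketched as follows. This is a minimal model under stated assumptions: the 512-byte node size, the 14-byte descriptor size, the 2-byte big-endian offsets, and the class and method names are all invented for the example and are not taken from the text.

```python
import struct

NODE_SIZE = 512        # assumed node size in bytes
DESCRIPTOR_SIZE = 14   # assumed size of the node descriptor segment

class NodeBuffer:
    def __init__(self):
        self.buf = bytearray(NODE_SIZE)
        self.count = 0
        # With no records, the only offset entry marks the start of free space.
        self._set_offset(0, DESCRIPTOR_SIZE)

    def _slot(self, i):
        # Entry 0 occupies the last two bytes of the node, entry 1 the two
        # bytes before it, and so on toward the free space.
        return NODE_SIZE - 2 * (i + 1)

    def _set_offset(self, i, value):
        struct.pack_into(">H", self.buf, self._slot(i), value)

    def offset(self, i):
        return struct.unpack_from(">H", self.buf, self._slot(i))[0]

    def free_space(self):
        # Bytes between the end of the records and the offset entries.
        return self._slot(self.count) - self.offset(self.count)

    def append(self, record):
        start = self.offset(self.count)
        end = start + len(record)
        # The record must stop before the slot of the new offset entry.
        if end > self._slot(self.count + 1):
            raise ValueError("node full")
        self.buf[start:end] = record
        self.count += 1
        self._set_offset(self.count, end)  # new start of free space

    def record(self, i):
        return bytes(self.buf[self.offset(i):self.offset(i + 1)])

node = NodeBuffer()
node.append(b"key1data")
node.append(b"k2")
print(node.record(0), node.record(1), node.free_space())
```

Because each record's length is the difference between two neighbouring offsets, variable-length records need no separate length field in this sketch.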

If node 42 is an index node, then each record 60 and 61 is
comprised of a key and pointer information. Further, NDFLINK
51 and NDBLINK 52 would contain adjacent index node linking
pointers. If node 42 is a leaf node, then each record 60 and
61 is comprised of a key and data information. NDFLINK 51 and
NDBLINK 52 would also contain leaf node linking pointers. It
is also appreciated that although a particular format is
illustrated for node 42, the format may be modified readily
to include other types of information. Also, in the preferred
embodiment data information in the leaf nodes of the HFS
catalog B-Tree is used to address locations in memory where
the actual data is stored.

Referring to FIG. 5, a specialized B-Tree expansion
architecture as implemented in the preferred embodiment is
shown. A node 70, which is equivalent to node 42 of FIG. 4,
is shown having pointers to two lower-level nodes 71 and 73,
which may be index or leaf nodes. Although only two nodes 71
and 73 are shown at the lower level, any number of nodes may
reside at this lower level. Also in this hypothetical
example, nodes 71 and 73 are only partially filled.

For a B-tree to maintain its balance, records must be 
kept uniformly spaced within the hierarchical structure. 
An unbalanced tree will result when records are not 
maintained uniformly in each node or nodes are heavily 
stacked toward one branch of the B-Tree. The pre- 
ferred embodiment uses a technique of left rotate and 
left splits to provide movement of records from one 
node to another to maintain a balanced Tree. When records are
to be transferred to another node, the left rotate operation
is used. In this instance, records in node 73 are left
rotated to its left adjacent node 71, as shown by arrow 77.

If another node is needed, such as when records in node 73
must be rotated and node 71 cannot accommodate records from
node 73, a left split operation is used to insert node 72 to
the left of node 73, between nodes 71 and 73. In this
instance, node 72 is inserted to link node 71 and node 73, as
shown by arrows 78. When node 72 is inserted, appropriate
pointer links will be established with its index node 70 as
well as adjacent link pointers for nodes 71 and 73.
Continually moving data leftward and inserting new data at
the right extremities helps keep the B-tree balanced. Because
the HFS of the present invention is structured to have the
ascending nodes organized in a rightward direction, the
balancing is maintained even though the rotates and splits
are made toward the left direction. It is appreciated that
right splits and rotate operations, or balanced insertions
using both right and left operations can be used as well.
Although the preferred embodiment uses and attempts to
maintain a balanced B-Tree for search efficiency, almost any
B-Tree structure can be used, including an unbalanced B-Tree.
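
The left rotate and left split operations can be illustrated on a single level of nodes. The sketch below is a simplification, not the patented procedure: a level is modelled as a Python list of sorted lists, the capacity of four records per node is an assumption, only one record is rotated at a time, and the updates to the parent index node and to the link pointers are left out.

```python
from bisect import insort

CAPACITY = 4  # assumed maximum number of records per node

def insert(level, key):
    """Insert key into one level of nodes; each node is a sorted list."""
    # Pick the last node whose first key is <= key (or the first node).
    i = 0
    for j, node in enumerate(level):
        if node and node[0] <= key:
            i = j
    node = level[i]
    insort(node, key)
    if len(node) <= CAPACITY:
        return
    left = level[i - 1] if i > 0 else None
    if left is not None and len(left) < CAPACITY:
        # Left rotate: the smallest record moves to the left neighbour.
        left.append(node.pop(0))
    else:
        # Left split: a new node is inserted to the left of the full node
        # and receives the lower half of its records.
        half = len(node) // 2
        level.insert(i, node[:half])
        del node[:half]

level = [[]]
for key in range(1, 11):  # new records arrive at the right end
    insert(level, key)
print(level)
```

With keys arriving in ascending order, the rotations pack the nodes on the left before a split is needed, which matches the observation that moving data leftward while inserting at the right keeps the tree evenly filled.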


CATALOG TREE 


Referring to FIG. 6, a hypothetical catalog 90 is used to
illustrate the implementation of cataloging of the preferred
embodiment. The structure 90 has a root directory 91 named
“Volume”. Each directory of the preferred embodiment is
assigned a unique numerical identifier known as the directory
identifier (DirID). The root directory of catalog 90 has
DirID value of 2. Root directory 91 has three branches
comprised of directory 92 and files 93 and 94. Directory 92
has a name of “Folder” and a DirID value of 29. In turn,
directory 92 has two branches comprised of files 95 and 96.
Files 93-96 are named “A”, “B”, “C” and “D”, respectively in
this example. The architecture of the directories and files
follows the HFS structure as previously explained in FIG. 2.
The complete cataloging structure 90 is stored as data
records in various leaf nodes of the B-Tree of FIGS. 3 and 4
known as the catalog B-Tree. It is appreciated that the
cataloging structure 90, although a tree, is in itself not a
B-Tree. The form of structure 90 is actually stored in the
various leaf nodes of a B-Tree. It is to be appreciated that
the cataloging structure 90 not be confused with the previous
description of the B-Tree. Catalog 90 and the B-Tree
structure are two separate and distinct structures. The
hierarchical structure of the catalog 90 is implemented using
a B-Tree structure and stored as data records in leaf nodes
of a B-Tree similar to that of FIGS. 3 and 4.


The hierarchical catalog structure 90 is stored in a storage
device as shown by a memory map 97 of FIG. 6. Cataloging map
97 is comprised of three possible types of records: directory
records 100, file records 101, and thread records 102. Each
record 100-102 is comprised of a key 103 and information
segment 104, as earlier described in the description of a
leaf node of a B-Tree. The key 103 of each record is
comprised of a value 105 and a name 106. The key 103 of a
directory record, such as that of 91 and 92, is comprised of
its directory name 106 and its parent directory’s DirID value
105. The information segment 104 of each directory record,
such as that of directories 91 and 92, is comprised of the
directory’s DirID value 107. For directory 92, the
directory’s DirID has been given the value of 29, and has a
name of “Folder”. The parent DirID of record 92 has been
given the value 2 because directory 92 is an offspring of
directory 91 in the structure 90. Directory record 91 has a
directory DirID value of 2, with a corresponding name of
“Volume”. Because directory 91 is a root directory, the
parent DirID value has been given the value of 1, wherein the
value 1 refers to the foundation of the filing system itself.

A file record, such as file records 93-96, is also comprised
of a key 113 and an information segment 114, wherein key 113
is also comprised of a parent DirID value and a name.
However, in the information segment 114, the descriptive
location information for the actual stored file data is
maintained as well as a unique file number. The information
segments 114 of file records 93-96 contain the descriptive
location of the actual stored data information.

File record 94, having a file name of B, and file record 93,
having a file name A, both have a parent DirID value of 2.
The parent DirID value of 2 signifies that files A and B are
direct offspring of directory “Volume” having a DirID value
of 2. File 95, having a name C, and file 96, having a name D,
have parent DirID values of 29, which reflect the origination
of files C and D as offspring of the directory labeled
“Folder”, having a DirID value of 29. Therefore, by looking
at any file or directory record’s key 103, the stored
information provides the identification of the name of that
particular record as well as the DirID value of the parent
node.

To provide the interconnection of the different branches, a
thread record 102 is provided for each directory. The key of
a thread record contains a DirID value and a null-name, which
is equivalent to having no name at all. In the example of
FIG. 6, thread record 108 provides the connection between the
directory “Folder” and files C and D. In the key 111 of
thread record 108, only the directory DirID value of “Folder”
is given. In the information segment 112 of thread record
108, the DirID of the parent of “Folder” and the directory’s
name “Folder” are given. Therefore, when file C, having a
parent DirID 29, attempts to link to its immediate parent
directory 92, which has a DirID of 29, the thread record 108
provides the name (Folder) of the parent directory 92, as
well as the parent DirID value of directory 92, which is
equal to 2.

Equivalently, thread record 109 provides the name (Volume) of
directory 91 as well as its parent directory DirID value for
the three offspring 92-94 of directory 91. By having
directory records 91-92, file records 93-96, along with
thread records 108-109 for each directory, the cataloging
structure 90 is interconnected into an HFS, wherein the
descriptive location information for the actual stored data
is stored in file records 93-96 as shown in the structure 97
of FIG. 6.
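
The three record types of FIG. 6 can be laid out as keyed entries. In the sketch below a Python dictionary stands in for the leaf records of the Catalog B-Tree; the field names, the helper functions and the file numbers 101-104 are invented for illustration, while the DirID values 1, 2 and 29 and the names come from the example above.

```python
ROOT_PARENT = 1  # parent DirID given to the root directory in the text

# key = (parent DirID, name); a thread record uses the directory's own
# DirID with an empty name. File numbers are hypothetical.
catalog = {
    (1, "Volume"):  {"kind": "directory", "dir_id": 2},
    (2, ""):        {"kind": "thread", "parent": 1, "name": "Volume"},
    (2, "A"):       {"kind": "file", "file_number": 101},
    (2, "B"):       {"kind": "file", "file_number": 102},
    (2, "Folder"):  {"kind": "directory", "dir_id": 29},
    (29, ""):       {"kind": "thread", "parent": 2, "name": "Folder"},
    (29, "C"):      {"kind": "file", "file_number": 103},
    (29, "D"):      {"kind": "file", "file_number": 104},
}

def children(dir_id):
    """Names of the files and directories directly inside a directory."""
    return [name for (parent, name) in sorted(catalog)
            if parent == dir_id and name]

def path_to(dir_id):
    """Use thread records to walk from a directory up to the root."""
    names = []
    while dir_id != ROOT_PARENT:
        thread = catalog[(dir_id, "")]
        names.append(thread["name"])
        dir_id = thread["parent"]
    return names[::-1]

print(children(2))   # offspring of "Volume"
print(path_to(29))   # names from the root down to "Folder"
```

Sorting the keys groups each directory's thread record with its offspring, because all of them share the same leading DirID and the empty name sorts first.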

By implementing the cataloging structure 90 using a B-Tree
structure, the hierarchical structure 90 is easily stored in
the leaf nodes of a B-Tree of the earlier description. For
example, when file C is to be accessed by a computer, the
system will implement a B-Tree search. Referring to the
catalog example 90 of FIG. 6, when the file with name C is to
be found, the search path must be specified for this search.
This can be given in terms of a sequence of the names of all
directories on the path from the root to the said file, thus
“Volume”, followed by “Folder”, and finally “C”. The search
begins by finding the directory record in the Catalog B-Tree
that corresponds to “Volume”. Its name is “Volume” and since
it is the root, its parent DirID value is 1. The catalog
B-Tree is searched for a directory record with key
<1>Volume; thus, directory record 91 is found. Its
information segment then provides the DirID value 2 of this
directory. Now a search is made through the B-Tree for the
record with key <2>Folder, which leads to the directory
record 92, whose information segment provides this
directory’s DirID value of 29. Thus now a search of the
B-Tree is made to find the data record with key <29>C. This
immediately leads the search to the file record 95, whose
information segment contains the information about the
physical location of the data contained in the desired file.

It will be appreciated that the specification of the file 
of the above example could start with the DirID value 
of any directory on the path from the root to the desired 
file, and would then consist of this DirID value and the 
sequence of names of the directories on the balance of 
the path from that directory to the desired file. The 
search mechanism followed is an obvious variant of the 
one indicated above. 
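
The key-by-key search just described, including the variant that starts from a known DirID, might look like this in Python. The dictionary again stands in for the Catalog B-Tree, so each `catalog[...]` access represents one B-Tree search; the function name, the tuple layout and the location strings are assumptions made for the example.

```python
# key = (parent DirID, name) -> (record kind, information segment).
# A directory's information is its DirID; a file's is a placeholder string
# standing in for its location information.
catalog = {
    (1, "Volume"): ("directory", 2),
    (2, "A"):      ("file", "location of A"),
    (2, "B"):      ("file", "location of B"),
    (2, "Folder"): ("directory", 29),
    (29, "C"):     ("file", "location of C"),
    (29, "D"):     ("file", "location of D"),
}

def lookup(path, start_dir_id=1):
    """Resolve a sequence of names, one keyed search per name."""
    dir_id = start_dir_id
    for depth, name in enumerate(path):
        kind, info = catalog[(dir_id, name)]  # KeyError if not found
        if kind == "file":
            if depth != len(path) - 1:
                raise NotADirectoryError(name)
            return info
        dir_id = info  # the DirID becomes the next key's first field
    return dir_id

print(lookup(["Volume", "Folder", "C"]))  # keys <1>Volume, <2>Folder, <29>C
print(lookup(["C"], start_dir_id=29))     # same file, starting at DirID 29
```

The second call shows why a specification may begin at any directory on the path: only the names below that directory are then needed.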

Although cataloging structure 90 is a simplified structure
and FIG. 6 only shows the presence of a single structure
having a single root directory 91, a cataloging structure may
be enlarged manyfold. The preferred embodiment uses one HFS
cataloging structure per memory device, such as a disk.
However, such a disk can be partitioned and an HFS catalog
assigned to each partition. [...]

[...] be stored in a single extent having a contiguous memory
allocation space. However, due to the size of certain 
files, as well as subsequent additions, deletions and mod- 
ifications to existing files, files are usually stored in more 
than one allocated area of the memory. Except in the case of
preallocated or small files, the contents of a particular
file are usually stored in more than one extent, separated
into non-contiguous sections on a volume. Each file extent
can be identified by an extent descriptor. Thus, the complete
location information of a particular file is a sequential
extents list consisting of the extent descriptors of the
various extents containing the file's data.

The file extents list of the present invention is orga- 
nized also as a B-Tree, known as the File Extents B- 
Tree, and records the volume location and size of the 
various extents that comprise the files. Although most 
any memory allocation system can employ the file ex- 
tents record of the present invention, a specific memory 
allocation system is described to illustrate the file ex- 
tents record of the preferred embodiment. 
Referring to FIG. 7, a memory volume 120 which is a portion
of a memory device, such as a hard disk, is shown. Volume 120
is segmented into a number of logical blocks 126. Typically,
each logical block 126 is comprised of a predetermined fixed
number of bytes, such as 512 bytes for the preferred
embodiment. A fixed number of logical blocks starting at
block 0 and ending at block n is reserved for volume
information. The balance of the memory device starting at
block n+1 is available for data storage and this storage area
is separated into allocation units, wherein each allocation
unit is comprised of one or more contiguous logical blocks.

Volume 120 includes four areas 121-124. System start-up area
121 contains certain configurable system parameters which are
well-known in operating a disk or other memory devices.
Volume information area 122 contains information regarding
the housekeeping parameters of the volume, such as number and
size of each allocation unit. Volume bit map 123 maintains
record of each allocation unit on the volume 120 and uses a
bit map to designate use or non-use of each allocation unit.

Commencing at block n+1, a file content area 124 
extends to the end of the Volume 120. File content area 
124 is separated into a number of allocation units, 
wherein each allocation unit is comprised of a fixed 
number of logical blocks. While the bit map 123 main- 
tains volume space management, it does not provide file 
mapping. The file mapping function is provided by the 
file extents lists. 
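
A volume bit map of this kind can be sketched as one bit per allocation unit. The class below is illustrative only: the most-significant-bit-first ordering, the first-fit search and the method names are assumptions, since the text states only that a bit designates use or non-use of each unit.

```python
class VolumeBitmap:
    def __init__(self, units):
        self.units = units
        self.bits = bytearray((units + 7) // 8)  # all units start free

    def is_used(self, unit):
        return bool(self.bits[unit // 8] & (0x80 >> (unit % 8)))

    def _set(self, unit, used):
        mask = 0x80 >> (unit % 8)
        if used:
            self.bits[unit // 8] |= mask
        else:
            self.bits[unit // 8] &= ~mask & 0xFF

    def allocate(self, count):
        """Mark the first run of `count` free units as used.

        Returns the starting unit, or None when no such run exists.
        """
        run = 0
        for unit in range(self.units):
            run = 0 if self.is_used(unit) else run + 1
            if run == count:
                start = unit - count + 1
                for u in range(start, unit + 1):
                    self._set(u, True)
                return start
        return None

    def free(self, start, count):
        for u in range(start, start + count):
            self._set(u, False)

bitmap = VolumeBitmap(16)
first = bitmap.allocate(3)
second = bitmap.allocate(5)
print(first, second)
```

The bit map answers only whether a unit is taken; which file owns it is recorded separately, which is the role of the file extents lists.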

Referring also to FIG. 8, a portion of file contents area 124
is shown containing information attributed to a file labeled
file E. In this hypothetical example the entire contents of
file E are segmented into seven extents 125-131. The first
portion of the file is stored in base extent 125, and the
subsequent portions of the file are distributed accordingly
in extents 2-7, which are labelled 126-131. File E has seven
extents 125-131 which are not physically contiguous. To
maintain file extents information an extent descriptor 140 is
used for the base extent 125 and each of the subsequent
extents 126-131 of file E.

Extent descriptor 140 is comprised of a starting allocation
unit number 141 and number of allocation units 142. File E
extents list 135, which is comprised of seven extent
descriptors 125a-131a, provides information as to the address
and length of each extent 125-131 of file E. For example, the
fourth extent 128, which has a starting allocation address of
189 and is only two allocation blocks long, has a value of
189 in field 141 and a value of 2 in field 142 of descriptor
128a.

Extents descriptors of all files in a volume are maintained
in the present invention in the data records contained in the
leaf nodes of a B-Tree such as that of FIGS. 3-5. This tree
is known as the File Extents B-Tree and is a separate B-Tree
from the earlier described catalog B-Tree. Each data record
of this extents B-Tree consists of a key and an information
segment as before in the discussion of FIGS. 3-5. The
information segment of a File Extents B-Tree data record is
comprised of a sequence of extents descriptors of a
particular file. The maximum number of extents descriptors in
such a record can vary from implementation to implementation,
but in the preferred embodiment is set to three. The key of
the File Extents B-Tree record consists of two fields: the
file number of the particular file and the file relative
position of the starting block of the first extent descriptor
in that record. These extents records are kept in the leaf
nodes of the Extents B-Tree sorted in ascending order first
on the file number field and then on the file relative
position of the starting block. This allows efficient search
through the B-Tree for the location information of data at a
particular file relative position.

In actuality, the preferred embodiment stores three extents
descriptors, base plus two subsequent extents descriptors, in
the information data segment 114 of the file’s catalog B-Tree
record such as 94 of FIG. 6. Therefore, in the example of
FIG. 8, extent descriptors 125a, 126a and 127a are kept in
the information segment of the cataloging structure and
extent descriptors 128a-131a are kept in the File Extents
B-Tree as shown in FIG. 9. Permitting limited extent
information to be kept in the data segments of a cataloging
structure permits faster access to data. Only when a file
contains four extents or more will it need to consult the
File Extents B-Tree. It should be appreciated that the number
of extents which are kept in the file’s Catalog B-Tree record
without using a File Extents B-Tree is arbitrary and can be
changed without departing from the spirit and scope of the
invention.
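
The division of a file's extents between its catalog record and the File Extents B-Tree can be sketched as below. The extent lengths 3, 5, 1, 2, 3, 1 and 7, the start value 189 of the fourth extent and the file number 20 follow the file E example in this description; every other starting allocation unit is a made-up value, and the sketch assumes one logical block per allocation unit so that lengths can be added directly.

```python
from typing import NamedTuple

class Extent(NamedTuple):
    start: int  # starting allocation unit number (field 141)
    count: int  # number of allocation units (field 142)

PER_RECORD = 3  # extent descriptors per record in the preferred embodiment

def split_extents(file_number, extents):
    """Return (catalog extents, {extents record key: extents})."""
    catalog_part = extents[:PER_RECORD]
    position = sum(e.count for e in catalog_part)
    overflow = {}
    for i in range(PER_RECORD, len(extents), PER_RECORD):
        group = extents[i:i + PER_RECORD]
        # Key = (file number, file relative position of the first block).
        overflow[(file_number, position)] = group
        position += sum(e.count for e in group)
    return catalog_part, overflow

# File E: only the start value 189 comes from the text.
file_e = [Extent(100, 3), Extent(140, 5), Extent(170, 1), Extent(189, 2),
          Extent(210, 3), Extent(230, 1), Extent(250, 7)]

catalog_part, overflow = split_extents(20, file_e)
print(catalog_part)
print(list(overflow))
```

The first overflow key begins where the catalog record's three extents end, so the keys alone tell a search which record covers a given relative block.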

FIG. 9 shows a catalog file record 145 and File Extents
B-Tree records 143 and 144. As explained in the structure of
B-Trees of the present invention, each record 143 and 144 is
comprised of a key 148 and 149 and extents list 146 and 147,
respectively. To locate a certain portion of the data of a
particular file, first the Catalog B-Tree is searched for the
corresponding file record. From this file record’s
information segment, the file number is extracted. Also, the
first three extent descriptors in the information segment of
the catalog B-Tree file record are examined. If the required
file data is contained within the corresponding extents, then
the location information is now readily available. If,
however, the desired file data is located in extents beyond
the three in the catalog’s file record, then a search is made
of the File Extents B-Tree using as a search key the file
number and the computed file relative block position of the
desired data. This search will lead to the File Extents
B-Tree record containing the desired location information.

In the example, file E is comprised of 22 blocks
and has an arbitrary file number equal to 20. The
extent descriptors contained in the catalog file record
145 for file E provide the location information for the
first 3 extents, which in turn comprise the first 9 blocks
(3+5+1) of the file. The location information for the
remaining 13 blocks (2+3+1+7) of the file is con-
tained in two data records 143 and 144 within the File
Extents B-Tree. Assume that the desired data is at file
relative block position 13 within file E. The extent de-
scriptors contained in the file’s catalog record are exam-
ined first. Since relative block 13 is greater than the
number of blocks located by the extent descriptors in
the file’s catalog record, the File Extents B-Tree is
searched. The key used for the B-Tree search for rela-
tive block position 13 is <20,13>.
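
The first stage of this lookup can be sketched as follows. This is a minimal sketch under stated assumptions: the function name and the tuple layout are hypothetical, and the volume starting blocks (100, 200, 300) are invented for illustration. Only the extent lengths 3, 5 and 1 and the file number 20 come from the example.

```python
from typing import List, Optional, Tuple

# Assumed layout of an extent descriptor: (starting block on the volume, block count).
Extent = Tuple[int, int]

def find_in_catalog_extents(extents: List[Extent], rel_block: int) -> Optional[int]:
    """Return the volume block holding file relative block rel_block,
    or None if it lies beyond the extents kept in the catalog record."""
    first = 0  # file relative position of the first block of the current extent
    for start, count in extents:
        if rel_block < first + count:
            return start + (rel_block - first)
        first += count
    return None

# Hypothetical starting blocks; only the lengths 3, 5, 1 follow the example.
catalog_extents = [(100, 3), (200, 5), (300, 1)]
file_number = 20

hit = find_in_catalog_extents(catalog_extents, 4)    # inside the second extent
miss = find_in_catalog_extents(catalog_extents, 13)  # beyond the first 9 blocks

# On a miss, the File Extents B-Tree is searched with this key.
search_key = (file_number, 13) if miss is None else None
```

Blocks 0 through 8 resolve directly from the catalog record. Block 13 does not, so the search key <20,13> is formed for the File Extents B-Tree.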

Since the key value of "13" is greater than the value
"9" of key 148 for the first File Extents B-Tree record
143 for file E and is less than the value "15" of key 149
for the second record 144, the search ends with a "not
found" result but is positioned at the second B-Tree
record 144. By retrieving the record 143 preceding record
144, the extent descriptor for relative block 13 is obtained.
The value of "9" for key 148 is derived because extents
list 146 starts at the tenth relative block (allocation unit
number 9). The value of "15" for key 149 is derived
because extents list 147 starts at the sixteenth relative
block (allocation unit number 15).
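
The "not found, then step back one record" search can be illustrated with a sorted list standing in for the B-Tree leaf nodes. This is a sketch, not the patent's routine: `locate` is a hypothetical name, Python's `bisect` module replaces the B-Tree traversal, and the volume starting blocks (400, 500, 600, 700) are invented. The keys and the extent lengths (2+3+1 and 7) follow the file E example.

```python
import bisect

# Leaf records for file E: (key, extents list). Keys are
# (file number, file relative starting block); each extent is
# (hypothetical starting block on the volume, block count).
records = [
    ((20, 9),  [(400, 2), (500, 3), (600, 1)]),  # record 143, key 148
    ((20, 15), [(700, 7)]),                      # record 144, key 149
]
keys = [key for key, _ in records]

def locate(file_number, rel_block):
    """Return the volume block for a file relative block, or None if no
    extents record of that file covers it."""
    # Position just past the last key <= the search key, then step back one.
    i = bisect.bisect_right(keys, (file_number, rel_block))
    if i == 0 or keys[i - 1][0] != file_number:
        return None
    (_, first), extents = records[i - 1]
    for start, count in extents:
        if rel_block < first + count:
            return start + (rel_block - first)
        first += count
    return None

block_13 = locate(20, 13)
```

For the key <20,13> the search lands between the two records and steps back to record 143, whose second extent covers relative blocks 11 through 13.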


IMPLEMENTATION 


The HFS of the present invention is implemented in a
computer which is coupled to a memory device, such as
a disk, capable of storing millions of bits of
information, although any storage medium can use the
HFS. Typically, the HFS of the present invention pro- 
vides the cataloging of various groupings of data, such 
as files, which are stored on the disk. 

The preferred embodiment implements data storage
by the use of a cataloging structure previously de-
scribed to catalog data stored on a large capacity mem-
ory device. It also maintains a file extents record of up
to three extents per file in the catalog. Subsequent ex-
tent information is stored in a separate file extents re-
cord. Both the catalog record and the extents record are
maintained using two B-Trees of the earlier described 
B-Tree structure. 

The HFS as described in the preferred embodiment is 
controlled by a combination of hardware and software 
in a computer system. The HFS controlling routines are 
stored in a separate storage device from the device used
for storing the actual data. The preferred embodiment
stores the routines in a read only memory (ROM), al-
though almost any storage medium may be used.

Thus, a hierarchical filing system for use with a large 
capacity memory device is described.

We claim: 

1. In a computer, a hierarchical filing system to pro-
vide cataloging and retrieval of data stored on a storage
device, said hierarchical filing system comprising:


a memory for storing a program for said cataloging
and retrieval;

a processor coupled to said memory and said storage
device for processing an organizing means to cata-
log and retrieve said data; said processor compris-
ing;

said program for organizing said data on said storage
device into a hypothetical catalog which has a root
directory, a plurality of branching directories ar-
ranged at various subsequent levels from said root
directory, wherein some of said branching directo-
ries branch from other of said branch directories;
said branching directories being interconnected
such that for each of said branching directories
there is only a singular path from itself to said root
directory; and wherein some of said
directories have at least one file, each file corre-
sponding to a representation of a predetermined
portion of said stored data;

an assigning means for assigning a unique identifica- 
tion value to said root directory and each of said 
branching directories, and assigning an identifica- 
tion name to each of said files, root directory and 
branching directories, wherein each of said branch-
ing directories and files are each provided with a
key comprised of its identification name and its 
next higher level directory identification value; 

a list forming means for forming a linear list of files 
and directory entries such that said file and direc- 
tory entries are ordered by said keys, such that said 
root directory being the highest level and files 
being the lowest level; and said interconnection of
each of said singular path is provided by each file
and branching directory identification name being
associated with directory identification value of its
next higher level;

a structure forming means for forming a B-Tree in- 
dexing structure having a beginning node, a plural- 
ity of indexing nodes and a plurality of terminating 
nodes, and wherein said linear list is stored in said 
terminating nodes of said B-Tree indexing struc- 
ture. 

2. The hierarchical filing system defined in claim 1,
wherein said memory for storing said program is a read 
only memory. 

3. In a computer system where data is to be cata-
logued when stored into a memory device, a method
performed by the computer system for providing a
hierarchical filing system to catalogue said data into a
volume of said memory device for subsequent retrieval,
comprising the steps of:
creating a root directory, a plurality of subdirectories 
and a plurality of files; 
organizing said root directory, subdirectories and 
files into a hypothetical catalog wherein said root 
directory is at a topmost level and said subdirecto- 
ries are arranged at various subsequent levels from 
said root directory, some of said subdirectories 
branch from other of said subdirectories, but said 
subdirectories being interconnected such that for 
each of said subdirectories there is only a singular 
, and wherein 


tion value; 
forming a linear list of files and subdirectory entries 
that said file and subdirectory entries are or- 
said keys, such that said root directory 
highest level and files being the lowest 
interconnection of each of said sin- 


plurality of terminal nodes; 

storing said linear list in said terminal nodes of said 
B-Tree structure in alphanumerical order accord- 
ing to said numerical directory value; 

assigning said identification name of a given file to a
respective portion of said data; 

storing said data; 

placing memory location information in said files, 
wherein for each given file its memory location 
information locates its respective portion of said 
data stored in said memory device.

4. The method as described in claim 3 wherein said
step of forming said B-Tree indexing structure further
comprises the step of forming a B-Tree structure
wherein said beginning node comprises a root node of
said B-Tree, said indexing nodes comprise branch nodes
of said B-Tree, and said terminal nodes comprise leaf
nodes of said B-Tree.

5. The method as described in claim 4 wherein said
step of placing location information in said files com-
prises the step of providing a plurality of extent point-
ers, each extent pointer pointing to a location of a por-
tion of said data stored in said memory device corre-
sponding to said given file such that non-contiguous
data segments are made to correspond to each said file.

6. The method as described in claim 5 further com-
prising the step of forming a second B-Tree structure to
store a linear list of additional extent pointers for those
files which have more extent pointers than that which
can be stored in each file, said linear list of additional
extent pointers being stored in terminal nodes of said
second B-Tree structure by having each additional ex-
tent pointer stored in one of said terminal nodes.